



360 American Statistical Association. [74 



THE CENSUS AGE QUESTION. 

By Allyn A. Young, Professor of Economics, Leland Stanford Junior University.



I. 

In these Publications for June, 1910 (pp. 110-123), Professor 
William B. Bailey and Mr. Julius H. Parmelee printed the 
results of a study of the age returns of the census of 1900. 
They have brought to light a number of new and interesting 
facts which must be taken into account by every serious 
student of our population census. Even more important are 
the conclusions to which the interpretation of these new 
results has led them, for they have decided that the fact that 
in 1900 (for the first time in the history of the federal census) 
the enumerators were required to ascertain the year and 
month of birth of every person enumerated in addition to the 
age, did not materially affect the quality of the age statistics 
of that census. 

This conclusion is of interest because it is sharply at variance
with former views on the subject. But it gains further
importance from the fact that it led to the omission of the 
date of birth question in the census of 1910. These consid- 
erations may be held to justify an analysis of the report, with 
a view to ascertaining the roots of its inferences and testing 
the validity of its conclusions. Before entering upon this 
task it will be well, I think, to outline some of the various 
considerations that might lead one to imagine that the date 
of birth inquiry should have had some beneficial effect upon 
the age statistics in question. These considerations are of 
unequal weight, and I have arranged them roughly in what 
seems to me the inverse order of their importance. 

1. It may be taken, I think, as a fair and reasonable general 
presumption, that a question about a thing so precise and 
definite as the year and month of one's birth will elicit in 
general more accurate answers than a question as to age, 
even when enumerators are especially cautioned to guard
against loose and inexact answers and against the tendency
to state age in round numbers. It frequently happens, I have 
found, that after persons have passed a certain age and the 
years have begun to slip by rapidly, their age in years is not 
kept definitely in mind, and an answer to an inquiry as to 
age is apt to involve a mental subtraction of the present 
year from the year of birth, as being on the whole easier than 
an attempt to bring to account the fugitive memory of 
age in years. I am not disposed to lay much stress on this 
consideration, for there is no way of determining how preva- 
lent the habit mentioned is. But even if in the great majority 
of cases one's age is more definitely kept in mind than one's 
date of birth, there still remains the presumption (reasonable 
enough, I suggest, to be at least taken into account in any 
thorough consideration of the matter) that the very pre- 
cision of the date of birth question should emphasize, both 
to enumerator and enumerated, the necessity of an accurate 
and precise answer to the age question. 

2. The date of birth inquiry was recommended by the 
International Statistical Congress in 1872, and no subse- 
quent international statistical gathering has reversed or 
modified the recommendation. 

3. In the majority of European censuses, and especially
in those countries* in which statistical practice has reached
the highest level, date of birth, rather than age, is asked.†

* Austria, Belgium, Germany, Holland, Hungary, Italy, Norway, Sweden and Switzer- 
land may be mentioned. 

† I do not mean to suggest that European census methods are inherently better than our
own. But European practice in these matters affords a fairly good indication of where the
weight of the opinion of qualified statistical experts may be supposed to rest. I think it
worth while in this connection to quote one writer whose deservedly high reputation rests 
upon achievements in both statistical administration and statistical analysis: 

"Asking the age directly is the less common method, and is becoming obsolete, although 
it was used in the English and French censuses of 1891. Usually it is limited to the deter- 
mination of the number of completed years of life. Since censuses are not often taken 
at the end of a calendar year, the answers to this question do not permit us to ascertain the 
distribution of the population in objective groups of calendar years of birth. Moreover, 
this form of question increases the degree of uncertainty in the answers, for the age of an 
individual is a changing fact, every now and then to be reckoned anew. Mistakes happen 
in such reckonings, and it is even more often the case that the computations are not seriously
undertaken, but round numbered estimates are set down instead. Further difficulties
spring from the fact that this question tends to confuse the year of age, properly so called
(that is, the sum of the years of life entirely completed and passed by), with the year of life,
properly so called (that is, the year of life in which an individual yet remains, not having
fully completed it). The question as to the immutable fact of the date of birth (year,
month and day) is therefore distinctly more to the point, and is to be preferred wherever 
the level of popular education is high enough to permit of its being generally answered. 
To ask merely the calendar year of birth does not serve the purpose, and is also attended 
by the difficulty that it does not make it possible to determine the actual ages of the persons
enumerated unless the census happens to be taken exactly at the end of the year."—
G. von Mayr, Bevölkerungsstatistik, p. 74.




4. A comparison of European censuses in which the age at 
last birthday is asked with those in which the date of birth 
is asked shows that the age statistics of the latter are in 
general more accurate than those of the former.* 

5. The age returns of the United States Census of 1900 
were distinctly more accurate than those of any previous 
federal census. It is important to note that this increased 
accuracy showed itself in four ways: (1) There was a decrease 
in the overstatement of the ages of young children.† (2)
There was a decided lessening of the concentration of reported
ages on years constituting multiples of five and ten.‡ (3)
There was as compared with prior censuses a general smooth- 
ing of the whole age series§. This improvement, as meas- 
ured by the simple index which I have called the "coefficient 
of error" involved more than the mere reduction of the con- 
centration on round numbers, for while in this last respect 
the improvement in the census of 1890 over that of 1880 
was as marked as the improvement in 1900 over 1890, the
"coefficient of error" was only 3.4 per cent. in 1900 as against
7.5 in 1890 and 8.2 in 1880. (4) The number of reported 
centenarians was reduced from 8.0 per 100,000 population 
in 1880 and 6.4 per 100,000 in 1890 to 4.6 per 100,000 in 1900. 
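The figures quoted under points (3) and (4) can be put on a common footing by expressing each change as a percentage of the earlier value. The following is a minimal sketch in Python, offered only as an illustration; the helper name `relative_drop` and the dictionary names are assumptions introduced here, and the "coefficient of error" itself is taken as given because its formula is not stated in this article.

```python
# Figures quoted in the text; names are illustrative only.
coefficient_of_error = {1880: 8.2, 1890: 7.5, 1900: 3.4}    # per cent
centenarians_per_100k = {1880: 8.0, 1890: 6.4, 1900: 4.6}   # per 100,000 population


def relative_drop(series, earlier, later):
    """Percentage fall from the earlier census to the later one."""
    return 100.0 * (series[earlier] - series[later]) / series[earlier]


for name, series in (("coefficient of error", coefficient_of_error),
                     ("centenarians per 100,000", centenarians_per_100k)):
    for earlier, later in ((1880, 1890), (1890, 1900)):
        drop = relative_drop(series, earlier, later)
        print(f"{name}, {earlier} to {later}: {drop:.1f} per cent lower")
```

On these figures the fall in the coefficient of error between 1890 and 1900 is several times the fall between 1880 and 1890, which is the contrast the text draws.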

II. 

The investigation of Professor Bailey and Mr. Parmelee 
is based on a careful examination of original schedules of 
the census of 1900 from five large cities and five rural coun- 
ties, embracing in all 130,000 enumerated persons. It was 
found that in 12,526 cases, — a little less than ten per cent.
of the entire number, — the reported date of birth disagreed
with the reported age. In some cases other facts reported 
regarding the person enumerated were prima facie evidence 
as to which of the two discordant statements was correct, or 

* I have given such a comparison, so far as the statistics of children's ages are concerned,
in these Publications, Vol. VII, p. 237, and in Twelfth Census, Supplementary Analysis, p. 
140. I have also applied simple tests to the ages of adults as reported in the same cen- 
suses. The comparison indicated the superior accuracy of the date of birth inquiry, but 
the results seemed scarcely worth publishing, as I thought (I confess) that the matter was 
hardly open to question. The statement made in the text is, however, one which may 
easily be verified. 

† Twelfth Census, Supplementary Analysis, pp. 139-143.

‡ Ibid., p. 136.

§ Ibid., pp. 134-137. 




more nearly correct. In forty-four out of a random 
selection of fifty of such cases it was found that the reported 
age was more nearly correct than the reported date of birth. 
Of the eight cases given in full as being typical, seven involve, 
beyond reasonable doubt, errors of one kind and another in 
the statement of the date of birth. These facts seem to 
indicate clearly that where there are discrepancies between 
the reported dates of birth and the reported ages and where 
these reports can be controlled by other facts reported in 
the schedules, the reported ages are in general more trust- 
worthy than the reported dates of birth. A corollary of this 
finding is that where there were such discrepancies, the date 
of birth was more often estimated from the age than the age 
from the date of birth. 

It is not a necessary additional inference, however, that 
where the reported age and the reported date of birth agreed 
(that is, in over nine tenths of the 130,000 cases examined) 
the date of birth was more often the dependent and age the 
independent statement.* In the first place it is obvious that 
where neither age nor date of birth were known, the age must 
have been roughly estimated first, and then the date of birth 
computed. Doubtless this happened frequently in the 
enumeration of the more ignorant classes of the population,
and in other classes whenever the information about the 
persons enumerated was not given by themselves or by 
members of their immediate families. In the second place, what- 
ever the attending circumstances were, every case of dis- 
agreement between the reported age and the reported date 
of birth is a mark of carelessness on the part of the enume- 
rator. Now there is unmistakable evidence in the age 
returns themselves that where enumerators were working 
among an illiterate class of the population they lowered their 
own standards of accuracy.† The same effect would have
been produced, I imagine, where for any other reason, such 
as linguistic barriers, or inability to get returns directly from 

* There is the further possibility that in many cases both date of birth and age were 
accurately known and independently stated. The essential point, after all, is the effect of 
the date of birth question upon the precision of the returns. This is more important 
than the question of the dependence or independence of one or the other statement. 

† Twelfth Census, Supplementary Analysis, pp. 136, 137.




more than a small fraction of the persons enumerated, the 
enumerators found that their age returns were mere guesses. 
It would have seemed futile to take great pains in setting 
down the dates of birth, when the ages (in such cases, at 
least, the first estimates) were known to have little precision. 
The upshot of these considerations is to indicate a more or 
less cogent possibility that the 12,526 returns, in which dis- 
agreements between the reported dates of birth and reported 
ages were found, were not fairly representative of the 130,000 
cases examined (and, a fortiori, of the age returns of the 
Twelfth Census in general), but that they included a somewhat 
larger proportion of inaccurately known ages, and, hence, 
of cases in which the date of birth was estimated more or less 
carelessly from the reported age. There is still another pos- 
sibility that is at least worth mentioning: namely, that, as 
it was understood by the enumerators that the date of birth 
inquiry was only a method of ascertaining age, computations 
from the reported age back to the date of birth were on 
that account less carefully executed than computations from 
the reported date of birth to the age. This might help explain 
the prevalence of the first kind of computations among the 
cases of disagreement. 

More than two thirds of the 12,526 cases of disagreement 
were, however, of a specialized type, — that is, in over two 
thirds of these instances the reported age was just one year 
greater than would have been consistent with the reported 
date of birth. In these cases there was not enough discrep- 
ancy between the two reports to permit their being often 
tested by their harmony or lack of harmony with other 
facts on the schedules. I think it fair to infer (and the infer- 
ence is supported by the eight cases previously mentioned as 
given in detail) that the fifty tested cases of disagreement 
were drawn in large part from the miscellaneous one third 
of the cases of disagreement and in small part from the 
specialized two thirds. This at once raises a doubt whether 
the inference drawn from the fifty controlled cases (that in 
cases of disagreement the age is more often correctly stated 
than the date of birth) can be extended to these predominant
and specialized cases of disagreement. Professor Bailey*
seems to have experienced this doubt, for although he is 
convinced that in these cases the date of birth was computed 
(erroneously) from the age, he rests his case on what seem 
to him the inherent probabilities of the matter,† supported
by the fact that in several cases there is apparently clear 
evidence that the enumerator had computed and entered the 
years of birth at the close of his day's field work. 

But there is a further peculiarity in this class of incom- 
patible returns, for "it was found that in all these cases almost 
without exception the birth month was one of the last seven 
months of the year." As the information on the census 
schedules is supposed to represent the ages as they were on 
June 1 of the census year, it is obvious that there is some 
relation between this concentration of discrepant dates of 
birth and the date of the census. To borrow a concrete illus- 
tration from the paper under review: "X is returned on the 
schedule as born in September, 1865, and as being 35 years 
of age. . . . Therefore, if the date of birth was correct, 
X was not 35 years of age on June 1, 1900, but only 34; if 
on the other hand he was, in fact, 35 years of age in June, 
1900, and was born in September, then the year of his birth 
was not 1865, but 1864." Now if we proceed on the assump- 
tion that in this and similar cases of disagreement, either 
the reported age or the reported date of birth, the one or the 
other, accurately represented the truth with respect to the 
person enumerated, I see no way in which we can, without 
begging the question, assume that the age was computed 
from the date of birth, or, per contra, that the date of birth 

* This part of the article under review is written in the first person singular. 

† "To the writer, however, this [that the year of age was obtained in these cases by subtraction
from the date of birth] seems most unlikely, because it is his belief and observation
that most people keep a better mental record of their age than of their year of birth, and 
will answer more rapidly and promptly an inquiry as to age in years than as to year of 
birth, whether the inquiry applies to themselves or to some relative, friend, or acquaint- 
ance." — loc. cit., p. 114. 

I am inclined to agree with Professor Bailey's opinion so far as it relates to the ages of
"relatives, friends, or acquaintances." So far as one's own age is concerned I have expressed
above a variant opinion, which, however, may be taken as relating only to the
mental habits of intelligent persons of mature age. But, as we shall see later, the point is 
really of minor importance. It may be of interest, however, to note that in European 
censuses in which the date of birth is asked and which are taken in such years that round 
numbered years of birth do not coincide with round numbered ages, the concentration is 
on round numbered years of birth rather than on round numbered ages. This seems to 
prove conclusively that in those countries, at least, the date of birth is not very frequently 
estimated from the age. (See especially the analysis of age statistics in the Swiss census of
1888.) 




was computed from the age. If, however, some weight can 
be conceded to the tentative suggestion that these cases of 
disagreement constitute an especially inaccurate set of age 
returns, it becomes plausible that the ages were in general 
independently though loosely stated, and the date of birth 
computed in a careless manner from the reported age. 
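The case of X quoted above can be worked through mechanically. The sketch below is an illustration only, not a procedure taken from the census office or from the paper under review; the function names are assumptions, and a birthday falling in June or later is treated as not yet reached on June 1, which matches the "last seven months" of the quoted passage.

```python
CENSUS_YEAR, CENSUS_MONTH = 1900, 6   # returns refer to June 1, 1900


def age_at_census(birth_year, birth_month):
    """Age at last birthday on June 1, 1900, from year and month of birth.

    Assumption: only the month is known, so a birthday in June or later
    is counted as not yet reached on the census date.
    """
    age = CENSUS_YEAR - birth_year
    if birth_month >= CENSUS_MONTH:
        age -= 1
    return age


def birth_year_implied(reported_age, birth_month):
    """Year of birth that would make the reported age consistent."""
    year = CENSUS_YEAR - reported_age
    if birth_month >= CENSUS_MONTH:
        year -= 1
    return year


def discrepancy(reported_age, birth_year, birth_month):
    """Reported age minus the age implied by the reported date of birth."""
    return reported_age - age_at_census(birth_year, birth_month)


# X: returned as born in September, 1865, and as 35 years of age.
print(age_at_census(1865, 9))      # age if the date of birth is right: 34
print(birth_year_implied(35, 9))   # year of birth if the age is right: 1864
print(discrepancy(35, 1865, 9))    # the one-year excess: 1
```

The computation reproduces both horns of the quoted dilemma, and it is symmetrical: nothing in it indicates which of the two returns was the original statement and which was derived.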

The census was taken in a round numbered year and there 
was a partial coincidence between round numbered years of 
birth and round numbered ages. It would seem at least possi- 
ble that in cases where a round numbered age was set down as 
a loose approximation, a round numbered date of birth should 
have been set down in similarly careless fashion. The writers 
fortunately furnish a table showing the distribution of the 
final digits of the reported ages in 8,851 cases of these one-year
discrepancies. Twenty-five per cent. of these reported
ages end with either 0 or 5. The corresponding per cent.
for the entire population in 1900 was 21.2.* The difference
is not great, and I am not sure that it is at all significant, for 
the one-year discrepancies may have been especially signifi- 
cant in the reported ages of adults, where the concentration 
on multiples of 5 is most noticeable. So my hypothesis 
seems to fall to the ground so far as these one-year discrep- 
ancies are concerned, and I see no objective method of deter- 
mining for these cases whether the reported date of birth or 
the reported age should be considered more frequently correct. 
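The test applied in this paragraph is a simple share of final digits. A minimal sketch follows, assuming a hypothetical helper `round_digit_share` and a made-up list of ages used only to exercise it; the table of 8,851 cases is not reproduced in this article, so only the two quoted percentages are taken from the text.

```python
from collections import Counter


def round_digit_share(ages):
    """Per cent of ages whose final digit is 0 or 5."""
    digits = Counter(age % 10 for age in ages)
    return 100.0 * (digits[0] + digits[5]) / len(ages)


# Hypothetical ages, for illustration only (not data from the schedules).
sample_ages = [30, 31, 34, 35, 35, 39, 40, 40, 42, 47, 50, 58]
print(f"{round_digit_share(sample_ages):.1f} per cent end in 0 or 5")

# Figures quoted in the text.
cases = 8851
share_discrepant = 25.0    # per cent, among the one-year discrepancies
share_population = 21.2    # per cent, population aged one year or over

excess_cases = cases * (share_discrepant - share_population) / 100.0
print(f"about {excess_cases:.0f} more round-numbered ages than the general rate implies")
```

For comparison, with no concentration at all the share would be 20 per cent., since two final digits out of ten are in question.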

III. 

Professor Bailey and Mr. Parmelee have, however, compiled 
a table which, if interpreted in a certain way, seems to add 
considerable weight to their contention that ages, rather than 
dates of birth, were independently stated. This table shows 
the proportions in which ages reported as between twenty- 
three and sixty-two years inclusive were distributed among 
the years ending in each of the ten digits. Such figures are 
given for the aggregate population as reported in 1900, 1890, 
and 1880, and for the population in 1900 as redistributed by 

* This is the per cent. which reported ages ending in 0 or 5 made of the aggregate number
of persons reported as one year old or over. The number of children reported as under one 
is excluded from the computation as this number could not have included any one-year 
age discrepancies. 




interpolation within the successive five-year age groups. 
This device shows very clearly the reduced concentration 
on multiples of five in 1900, and it also shows the correspond- 
ing relative increase in the numbers reported at certain other 
ages. The significant thing is an especially noticeable swell- 
ing of the numbers reported at ages ending in 9 or 4. From 
what we know about the nature of the errors in age returns 
we would have expected this decrease in the concentration 
on round numbers to fill up the ages immediately above the 
round numbers rather more than the ages immediately below 
them.* Yet the table seems to show that the apparently less 
probable result happened. 

The explanation offered by Professor Bailey and Mr. 
Parmelee is essentially as follows: When the schedules were 
edited in the census office, such cases of discrepancies between 
the two age reports as were noticed were adjusted on the 
assumption that the date of birth was the more correct return. 
As a result, when the reported age was a year greater than 
the reported date of birth warranted, it was reduced by one 
year. And since there was a concentration of the reported 
ages on round numbers, by this process of adjustment the 
years ending in 4 and 9 lost less to years ending in 3 and 8 
than they gained from years ending in 5 and 0. Similarly, 
years ending in 5 and 0 lost more to years ending in 4 and 9
than they gained from years ending in 6 and 1. If this explanation
is adequate, the abnormal swelling of the ages ending in
4 and 9 indicates, it is thought, that the corrections made in 
the census office were not well advised, the reported ages 
being more accurate than the reported date of birth. It also 
indicates, — what is even more to the point, — that part of 
the diminished concentration on round numbers in 1900 is 
apparent rather than real, having been brought about by the 
arbitrary clerical process described.†
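The mechanism described in this explanation can be illustrated with a small model. It is a sketch under stated assumptions, not a reconstruction of the census office procedure: the counts per final digit are hypothetical, the fraction `p` of returns reduced by one year is hypothetical, and the same fraction is assumed at every digit.

```python
def adjust_down(counts, p):
    """Shift a fraction p of the returns at each final digit down by one year.

    counts[d] is the number of reported ages ending in digit d.
    Assumption: the same fraction p is adjusted at every digit.
    """
    return [counts[d] - p * counts[d] + p * counts[(d + 1) % 10]
            for d in range(10)]


# Hypothetical counts per 1,000 returns, with heaping on 0 and 5.
before = [140, 85, 95, 90, 90, 120, 95, 95, 100, 90]
after = adjust_down(before, p=0.07)

for d in (9, 0, 4, 5):
    print(f"digit {d}: {before[d]} -> {after[d]:.2f}")
```

Because more returns stand at 0 and 5 than at their neighbors, a uniform downward shift of one year raises the counts at 9 and 4 and lowers those at 0 and 5, while the total is unchanged.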

In fact, Professor Bailey and Mr. Parmelee estimate (on the 
basis of the distribution of ages in 8,851 cases of one-year 

* See these Publications, Vol. VII, p. 38. 

† It should be noted, however, that even if the clerical adjustments described were well
advised they would nevertheless have brought about an appreciable swelling of the numbers
reported at ages ending with the digits 4 or 9. But in this case it would scarcely be accurate
to call the accompanying reduction of the concentration on round numbered ages an
"apparent" rather than a "real" improvement.


adjustments of this kind) that this process was responsible 
for a decrease of 2.9 per cent. in the concentration on round
numbers in 1900. By this much, then, the addition of the 
date of birth inquiry would seem to have been credited with 
more than its real effect in the improvement of our age 
statistics. 

The difficulty with the foregoing explanation is that it 
by no means explains the total amount of increase in the rela- 
tive number of reported ages ending in 4 and 9. The number 
of persons reported by the census at ages ending with the 
digit 9 was (proportionately to the total number between 
twenty-three and sixty-two years) greater by 12 per cent. in
1900 than it was in 1890. I estimate that the artificial shifting
described above (if it was as frequent as Professor Bailey
and Mr. Parmelee estimate) was responsible for between 2
and 3 per cent. in this increase, that is, at most, not more than
one fourth or one fifth of it. The general reduction in the
concentration on round numbers was responsible for probably
not more than 1 per cent. out of the 12.* Altogether
about two thirds or three fourths of this increase remains to 
be accounted for. In other words, this statistical evidence 
proves too much. 
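The accounting in this paragraph, together with the proportional estimate given in its footnote, can be checked with a few lines of arithmetic. The sketch below uses only figures quoted in the text; the variable names are assumptions introduced for illustration.

```python
total_increase = 12.0          # per cent rise at ages ending in 9, 1890 to 1900
clerical_shift = (2.0, 3.0)    # range estimated for the clerical adjustment
general_reduction = 1.0        # upper figure given for the general effect

unexplained_shares = []
for shift in clerical_shift:
    unexplained = total_increase - shift - general_reduction
    unexplained_shares.append(unexplained / total_increase)
    print(f"shift of {shift}: {unexplained} points unexplained, "
          f"{unexplained / total_increase:.2f} of the increase")

# Footnote arithmetic: scale the 1880-1890 experience to the 1900 decrease.
implied = 0.5 / 8.7 * 14.2
print(f"implied rise at ages ending in 9: {implied:.1f} per cent")
```

The two cases work out to three fourths and two thirds of the increase left unaccounted for, and the proportional rule gives 0.8 per cent., in agreement with the figures stated.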

There is another possible explanation of the swelling of the 
reported ages ending in the digits 9 and 4 which appeals to me 
as less labored and more adequate. The date of birth inquiry 
preceded the age inquiry on the schedule used by the enumer- 
ators in 1900. It is reasonable to suppose that in many 
if not the majority of instances the enumerators asked the 
date of birth before they asked the age. Among the answers 
to the date of birth inquiry there must have been, as in 
European census experience, more or less concentration on 
round numbered years. The more careful enumerators, at 
least, would have taken pains to see that the reported age 
agreed with the reported date of birth. Where the month of 
birth assigned happened to be one of the first five months of 

* A decrease of 8.7 per cent. in the relative number of reported ages ending in 0 in 1890
as compared with 1880 was accompanied by an increase in the relative number reported as
ending with 9 of only 0.5 per cent. On this basis, the decrease of 14.2 per cent. in the
number of reported ages ending in 0 in 1900 as compared with 1890 would have been
accompanied by an increase of 0.8 per cent. in the relative number of reported ages ending
with 9.




the year, the reported age would also have been a round 
number, — that is, a multiple of five. But where the month 
of birth assigned was one of the last months of the year, the 
reported age would have been a year ending with either the 
digit 4 or the digit 9. An increase in the relative number of 
ages reported as ending in these digits is a necessary effect of 
the concentration of reported years of birth on round num- 
bers. I offer this as a probable explanation of the amount of 
this increase not otherwise accounted for, and as a possible 
explanation of substantially all of this increase not accounted 
for by the general reduction in the concentration on round 
numbered ages.* 
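The effect claimed here follows directly from the census date and can be enumerated. This is a minimal sketch, assuming month-level dates of birth and taking years of birth ending in 0 or 5 as the "round numbered" years; the function name and the range of years chosen are illustrative only.

```python
CENSUS_YEAR = 1900   # ages refer to June 1, so a January-May birthday has passed


def final_digit_of_age(birth_year, birth_month):
    """Final digit of the age on June 1, 1900 (month-level precision assumed)."""
    age = CENSUS_YEAR - birth_year - (0 if birth_month <= 5 else 1)
    return age % 10


round_birth_years = range(1840, 1880, 5)   # 1840, 1845, ..., 1875
early, late = set(), set()
for year in round_birth_years:
    for month in range(1, 13):
        digit = final_digit_of_age(year, month)
        (early if month <= 5 else late).add(digit)

print(sorted(early))   # births in January-May: [0, 5]
print(sorted(late))    # births in June-December: [4, 9]
```

Since seven of the twelve months fall in the second group, a concentration of reported years of birth on round numbers would, on this reckoning, feed the ages ending in 4 and 9 rather more than those ending in 0 and 5.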

IV. 

The thesis of Professor Bailey and Mr. Parmelee thus loses 
what had been its statistical support. But let us proceed on 
the assumption that their estimate that 2.9 per cent. in the
decrease of the concentration on round numbers in 1900 was 
only apparent is thoroughly in accord with the facts. To 
quote the article under review: "If the [clerical] reductions 
had not been made, therefore, the excess concentration in 
1900 would have been 22.7 per cent, and not 19.8 per cent." 
From this to the conclusion that the date of birth inquiry was 
not justified by its results is surely a non sequitur, — for the 
corresponding excess concentration in 1890 was 31.3 per cent., 
and in 1880 it was 44.8 per cent. In 1890, it should be re- 
membered, the instructions to enumerators were much more 
insistent in the warnings given against the tendency to report 
ages in round numbers than they were in either 1880 or 1900. 
I know of no cause, except the date of birth inquiry, to which 
the marked and very creditable improvement in the charac- 
ter of the age statistics in 1900 can be attributed. I have 
no confidence in the efficiency of any "general improvement 
in census methods" apart from the concrete specific steps 
which constitute this general improvement. 
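The comparison on which this argument turns can be set out numerically. The sketch below uses only the excess concentration figures quoted above; the variable names are assumptions, and the differences are taken in percentage points.

```python
excess = {1880: 44.8, 1890: 31.3, 1900: 19.8}   # per cent, as quoted
apparent_only = 2.9                              # part attributed to clerical edits

adjusted_1900 = excess[1900] + apparent_only
drop_1880_1890 = excess[1880] - excess[1890]
drop_1890_1900_reported = excess[1890] - excess[1900]
drop_1890_1900_adjusted = excess[1890] - adjusted_1900

print(f"1900 without the clerical reductions: {adjusted_1900:.1f}")
print(f"fall 1880 to 1890: {drop_1880_1890:.1f} points")
print(f"fall 1890 to 1900 as reported: {drop_1890_1900_reported:.1f} points")
print(f"fall 1890 to 1900 after the concession: {drop_1890_1900_adjusted:.1f} points")
```

Even when the whole 2.9 per cent. is conceded, the figure for 1900 remains well below that for 1890, so a substantial reduction is left to be explained.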

It should be noted, finally, that if Professor Bailey and Mr. 
Parmelee have published the full results of their investiga- 

* If in these one-year discrepancies the reported dates of birth were in general more 
trustworthy than the reported ages, the effect of the clerical adjustments would simply 
constitute a special case, coming under the general explanation I have offered. 




tion they have neglected to estimate the amount of weight 
that should be given to the statistical practice of other countries, 
to the opinions of qualified experts in those countries, and to 
the recommendations of an international statistical gather- 
ing of the highest authority. They have neglected to com- 
pare the results obtained in censuses in which one form of 
question is used with the results of censuses using the other 
form. They have not reckoned with the effect of the form 
of the age inquiry upon the overstatement of the ages of 
children, upon the "general smoothness" of the age series, or 
upon the overstatement of the ages of persons advanced in 
years. They have shown, however, that the effect of the date 
of birth inquiry in the reduction of the concentration on 
round numbers was possibly not quite so great as had been 
supposed. But I see nothing to justify their final conclusion 
that "the inquiry as to date of birth played little or no part, 
either in increasing the accuracy of the age returns of the 
Twelfth Census, or in reducing the concentration on years 
ending in the integers 5 and 0." And I fear that a step back- 
ward in statistical practice was taken when "especially in 
view of the [foregoing] conclusion, ... it was decided to 
eliminate the query regarding date of birth from the 
population schedule of the Thirteenth Census and to retain 
only the question regarding age at last birthday." 



